20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

Hampel's influence function (J. Amer. Statist. Assoc. 69, pp. 383-393)
has been used in recent years for evaluating robust estimators, detecting
outliers, computing asymptotic variances for estimators, and for hypothesis
testing. In this report, the use of influence functions for various parameters
is proposed, not only as a tool for outlier detection, but also as a method for
replacing outliers and/or missing observations. This approach is illustrated
on the estimation of the variance of a distribution, and it is pointed out how
the method can be applied in simple linear regression problems. This approach
1. INTRODUCTION


Hampel (Ref. 1) introduced the influence function as a tool for assessing
robust estimators. In a multivariate population, an influence function can
usually be defined for estimators of parameters. This influence function can
be used to determine where in the n-dimensional space of observations an
observed vector would have a large effect on the value of the estimator of the
parameter.

For many parameters, the analytic form of the influence function can be
derived (e.g., the mean, the variance, the bivariate correlation coefficient,
and the multiple correlation coefficient). In other cases, it is difficult to
obtain a closed-form expression for the influence function. In such cases, an
empiric estimate may be useful.

Many agencies of the Federal government maintain large data bases and
publish reports containing statistical information (e.g., the Bureau of the
Census, the former Department of Energy, and the National Bureau of
Standards). These agencies use outlier detection and data editing methods as
quality control measures for their data bases. Also, the Department of Energy
has had an extensive program for reviewing several of its data bases. In this
data validation program, some new approaches for detecting outliers were
introduced, including the use of influence functions (Chernick [Ref. 2]).

At some of these agencies (particularly the Bureau of the Census) much
research has been conducted on the replacement of observations (commonly
called imputation). These techniques often rely on other related data to
obtain a reasonable estimate as a replacement for the "bad" value. Also, in
some cases, the method is designed so as to avoid inducing a large bias on the
estimate of a particular parameter. Unfortunately, many of these data bases
serve several purposes, and a favorable procedure for one estimate may
adversely affect estimates of other parameters which are important to
different users of the data base. Consequently, influence functions can play
an important role in the maintenance and validation of large data bases.
To give a formal definition, we point out that the influence function
depends on the distribution function $F$ of a random observation vector, the
parameter of interest, which is commonly written $T(F)$, and the observation
vector. The parameter is considered as a functional $T(F)$ of the distribution
function $F$. The influence function is defined by the following equation
whenever the limit on the right hand side exists:

$$I(F, T(F), x) = \lim_{\epsilon \to 0} \frac{T\big((1-\epsilon)F + \epsilon\,\delta_x\big) - T(F)}{\epsilon}$$

where $\epsilon$ is a positive real number, $x$ is a point of interest in the observation
space, and $\delta_x$ is the distribution function with all its probability mass
concentrated at $x$.

The influence function is approximately equal in large samples to $n$ times
the difference between the estimator with an observation at $x$ included and the
estimator with the observation at $x$ excluded, where $n$ is the sample size. This
can be seen by replacing $F$ with $F_{n-1}$ (the empiric distribution function for a
sample of size $n-1$) and approximating the limit as $\epsilon$ tends to 0 by
replacing $\epsilon$ with $1/n$. Because

$$\left(1 - \frac{1}{n}\right) F_{n-1} + \frac{1}{n}\,\delta_x = F_n,$$

we get that the influence function is
approximately $n\,(T(F_n) - T(F_{n-1}))$ as claimed.
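The leave-one-out approximation above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `empirical_influence`, the helper `mean`, and the data values are hypothetical and do not come from the report.

```python
def empirical_influence(sample, statistic):
    """Approximate the influence of each observation as n * (T(F_n) - T(F_{n-1})).

    `sample` is a list of observations and `statistic` is any function that
    maps a list to a number (the functional T evaluated on an empiric
    distribution).
    """
    n = len(sample)
    full_value = statistic(sample)              # T(F_n): all n observations
    influences = []
    for i in range(n):
        reduced = sample[:i] + sample[i + 1:]   # F_{n-1}: observation i left out
        influences.append(n * (full_value - statistic(reduced)))
    return influences


def mean(values):
    return sum(values) / len(values)


# Illustrative data only (not from the report); the last value is an outlier.
data = [2.0, 4.0, 6.0, 8.0, 30.0]
influences = empirical_influence(data, mean)
```

For the mean, this quantity reduces exactly to the observation minus the mean of the remaining $n-1$ observations, so the value 30.0 above receives an estimated influence of 25.0.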

Given a sample of observations, the influence function for a parameter of
interest may be estimated for each observation. An observation which has a
very large estimated influence* will deserve particular attention, and we may
be better off discounting it in our estimation procedure. If it is necessary
for such an observation to be replaced, we may choose a value with small or
zero influence on our estimator. The next section contains some common
examples of this concept.


*What constitutes a large estimated influence depends on what assumptions are
made about the underlying distribution.
In orbit determination problems, least-squares fitting methods are used
to estimate orbit parameters based on data such as pseudo-range, delta range,
and azimuth and elevation angle measurements. It is well known that the
least-squares procedure leads to parameter estimates that can be very
sensitive to outlying observations. Consequently, robust filtering or outlier
rejection techniques sometimes need to be used to obtain good estimates of
orbital elements on the basis of such data. The methodology proposed in this
report can be used to arrive at more sophisticated outlier rejection and
replacement techniques for the processing of these data.

The literature on influence functions is growing rapidly, with new
applications to estimation, hypothesis testing, and outlier detection appearing
regularly. Chernick (Ref. 2) mentions an application to the validation of
energy data and computes an influence function for multiple correlation in a
special case. Chernick, Downing, and Pike (Ref. 3) introduce an influence
function matrix for the autocorrelation function which can be applied to
detect outliers in time series. Reid (Ref. 4) determines an influence
function for the Kaplan-Meier estimator of a survival curve and uses it to
obtain the asymptotic variance of that estimator. The use of influence
functions for hypothesis testing is proposed by Lambert (Ref. 5). Devlin,
Gnanadesikan, and Kettenring (Ref. 6) illustrate the potential use of the
influence function for detecting outlying multivariate observations.
2. EXAMPLES


2.1 THE MEAN OF A DISTRIBUTION

For a univariate distribution function $F$, the mean can be written as

$$T(F) = \int_{-\infty}^{\infty} y\,dF$$

and the sample mean as

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = T(F_n),$$

where we assume $|T(F)| < \infty$, $\{X_i\}_{i=1}^{n}$ are the $n$ independent observations,
and $F_n$ is the empiric distribution function on the basis of these
observations. In this case

$$I(F, T(F), x) = \lim_{\epsilon \to 0} \frac{(1-\epsilon)\int_{-\infty}^{\infty} y\,dF + \epsilon x - \int_{-\infty}^{\infty} y\,dF}{\epsilon} = x - \int_{-\infty}^{\infty} y\,dF = x - T(F) = x - \mu. \qquad (1)$$


Replacing $\mu$ by $\bar{X}$, we obtain a sample estimate for $I$:

$$\hat{I} = x - \bar{X}.$$

This estimate of $I$ is unbiased and consistent. Large values of $\hat{I}$ correspond
to observations which are, say, 2 or 3 standard deviations away from the mean.
So thresholding the estimated influence function for the mean at 3 standard
deviations is equivalent to a 3 sigma outlier rejection rule. If the
observations have a normal distribution, the probability of the influence
function estimate exceeding 3 standard deviations is less than 0.01.
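As a minimal sketch of this rule, the code below estimates the influence on the mean at every observation and flags those exceeding three sample standard deviations in absolute value. The function names, the threshold parameter `k`, and the data are assumptions made for illustration; the standard deviation uses the divisor $n$, matching $T(F_n)$.

```python
import statistics


def mean_influence(sample):
    """Estimated influence on the mean at each observation: x_i - X-bar."""
    x_bar = sum(sample) / len(sample)
    return [x - x_bar for x in sample]


def flag_three_sigma(sample, k=3.0):
    """Return the indices whose estimated influence exceeds k standard deviations."""
    s = statistics.pstdev(sample)  # divisor n, i.e. the plug-in estimate
    return [i for i, infl in enumerate(mean_influence(sample)) if abs(infl) > k * s]


# Illustrative data only: nineteen values near 10 and one gross outlier.
data = [9, 10, 11, 10] * 4 + [9, 10, 11, 30]
flagged = flag_three_sigma(data)
```

Note that the outlier itself inflates both the sample mean and the sample standard deviation, so in small samples a single bad value may fail to cross the 3 sigma threshold.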

2.2 THE VARIANCE OF A DISTRIBUTION

The variance, $\sigma^2$, of a univariate distribution function $F$ can be written

$$T(F) = \int_{-\infty}^{\infty} (y - \mu)^2\,dF$$

and the sample variance

$$s^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 = T(F_n).$$

Again, this is meaningful only if we assume $T(F) < \infty$. The influence function
is given by

$$I(F, T(F), x) = (x - \mu)^2 - T(F) = (x - \mu)^2 - \sigma^2. \qquad (2)$$

A sample estimate could be

$$\hat{I} = (x - \bar{X})^2 - s^2.$$

This estimate is consistent but has bias $\sigma^2/n$. Note that large positive values
of $\hat{I}$ again correspond to X's which are 3 or more standard deviations away from
the mean. However, interestingly, the largest negative influence occurs at
$x = \bar{X}$.
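A short sketch of the sample estimate based on Eq. (2) may make the shape of this influence function concrete. The function name and the data are hypothetical; the sample variance uses the divisor $n$ as in $T(F_n)$.

```python
def variance_influence(x, sample):
    """Estimated influence of a point x on the variance: (x - X-bar)^2 - s^2."""
    n = len(sample)
    x_bar = sum(sample) / n
    s2 = sum((v - x_bar) ** 2 for v in sample) / n  # divisor n, as in T(F_n)
    return (x - x_bar) ** 2 - s2


# Illustrative data only: X-bar = 3 and s^2 = 2.
data = [1.0, 2.0, 3.0, 4.0, 5.0]

at_mean = variance_influence(3.0, data)                    # most negative value, -s^2
at_one_sigma = variance_influence(3.0 + 2.0 ** 0.5, data)  # zero influence at X-bar + s
far_out = variance_influence(9.0, data)                    # large positive influence
```

The estimate is a parabola in $x$: it takes its minimum $-s^2$ at the sample mean, crosses zero one standard deviation on either side of the mean, and grows quadratically beyond that.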


2.3 THE BIVARIATE COEFFICIENT OF CORRELATION FOR A BIVARIATE
DISTRIBUTION FUNCTION

Here $x = (x_1, x_2)$ is a two-dimensional vector and

$$T(F) = \frac{\int\!\!\int y_1 y_2\,dF - \int y_1\,dF_1 \int y_2\,dF_2}{\left[\int y_1^2\,dF_1 - \left(\int y_1\,dF_1\right)^2\right]^{1/2} \left[\int y_2^2\,dF_2 - \left(\int y_2\,dF_2\right)^2\right]^{1/2}}$$

where $F$ is the bivariate distribution defined by

$$F(x_1, x_2) = P[X_1 \le x_1,\ X_2 \le x_2]$$

and $F_1(x_1) = F(x_1, \infty)$, $F_2(x_2) = F(\infty, x_2)$.

In this case, the influence function is

$$I(F, T(F), x) = y_1 y_2 - \rho\,\frac{y_1^2 + y_2^2}{2} \qquad (3)$$

where $y_1 = (x_1 - \mu_1)/\sigma_1$ and $y_2 = (x_2 - \mu_2)/\sigma_2$, $\mu_1$ is the mean of $F_1$,
$\sigma_1^2$ is the variance of $F_1$, $\mu_2$ is the mean of $F_2$, $\sigma_2^2$ is the variance
of $F_2$, and $\rho = T(F)$. An estimate $\hat{I}$ can be obtained by replacing $\rho$, $\mu_1$, $\mu_2$,
$\sigma_1$, and $\sigma_2$ with estimates in Eq. (3). For a derivation of Eq. (3) see
Chernick (Ref. 2). Gnanadesikan (Ref. 7) points out that for bivariate normal
data, the estimated influence function for the Z transformation of the
correlation coefficient has approximately a product standard normal distribution.
This distribution can be used to determine significantly large values for $\hat{I}$.
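The plug-in estimate of Eq. (3) can be sketched as follows. The function name and the data are hypothetical, and the moment estimates use the divisor $n$; other consistent estimates could be substituted.

```python
import math


def correlation_influence(x1, x2, xs, ys):
    """Plug-in estimate of Eq. (3) at the point (x1, x2), from paired samples xs, ys."""
    n = len(xs)
    m1, m2 = sum(xs) / n, sum(ys) / n
    s1 = math.sqrt(sum((a - m1) ** 2 for a in xs) / n)
    s2 = math.sqrt(sum((b - m2) ** 2 for b in ys) / n)
    rho = sum((a - m1) * (b - m2) for a, b in zip(xs, ys)) / (n * s1 * s2)
    y1 = (x1 - m1) / s1  # standardized first coordinate
    y2 = (x2 - m2) / s2  # standardized second coordinate
    return y1 * y2 - rho * (y1 ** 2 + y2 ** 2) / 2.0


# Illustrative data only: sample correlation 0.8.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

# Influence of each sample point, and of a point that runs against the trend.
sample_influences = [correlation_influence(a, b, xs, ys) for a, b in zip(xs, ys)]
against_trend = correlation_influence(5.0, 1.0, xs, ys)
```

With these estimates the influences of the sample points sum to zero, and a point that is high in one coordinate and low in the other receives a strongly negative value.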

2.4  SLOPE  AND  INTERCEPT  PARAMETERS  IN  A  SIMPLE  LINEAR  REGRESSION 


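Here we assume

$$E(Y \mid X = x) = \alpha + \beta x \qquad (4)$$

and let $\mu_y = E(Y)$, $\mu_x = E(X)$, $\sigma_y^2 = \operatorname{Var} Y$ and $\sigma_x^2 = \operatorname{Var} X$. $X$ and $Y$ have a
bivariate distribution function with finite second moments.

Let $\rho = \operatorname{Cov}(X, Y)/(\sigma_x \sigma_y)$. We have from Eq. (4)

$$\beta = \rho\,\frac{\sigma_y}{\sigma_x}$$

and therefore

$$\alpha = \mu_y - \beta \mu_x. \qquad (6)$$

The influence function for $\beta$ is then

$$I(F, \beta, (x,y)) = \frac{\sigma_y}{\sigma_x}\, I(F, \rho, (x,y)) + \rho \left( \frac{I(F, \sigma_y, y)}{\sigma_x} - \frac{\sigma_y\, I(F, \sigma_x, x)}{\sigma_x^2} \right)$$

$$= \frac{\sigma_y}{\sigma_x}\, I(F, \rho, (x,y)) + \rho \left( \frac{I(F, \sigma_y^2, y)}{2 \sigma_x \sigma_y} - \frac{\sigma_y\, I(F, \sigma_x^2, x)}{2 \sigma_x^3} \right)$$

and from Eqs. (2) and (3)

$$I(F, \beta, (x,y)) = \frac{\sigma_y}{\sigma_x} \left( y_1 y_2 - \rho\,\frac{y_1^2 + y_2^2}{2} \right) + \rho \left( \frac{(y - \mu_y)^2 - \sigma_y^2}{2 \sigma_x \sigma_y} - \frac{\left[(x - \mu_x)^2 - \sigma_x^2\right] \sigma_y}{2 \sigma_x^3} \right), \qquad (8)$$

where $y_1 = (x - \mu_x)/\sigma_x$ and $y_2 = (y - \mu_y)/\sigma_y$ as in Eq. (3).

From Eq. (6) we get

$$I(F, \alpha, (x,y)) = (y - \mu_y) - \beta (x - \mu_x) - \mu_x\, I(F, \beta, (x,y)). \qquad (9)$$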

From Eqs. (8) and (9) we see that those influence functions can be
estimated by obtaining sample estimates of $\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$, and $\rho$. For the
regression parameters we see that the influence function depends on the same
parameters as for the correlation coefficient.
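A hedged sketch of the plug-in estimates for the slope and intercept influence functions, Eqs. (8) and (9), is given below. All names and data are hypothetical. As a cross-check, expanding Eq. (8) term by term collapses it to $(x - \mu_x)\left[(y - \mu_y) - \beta (x - \mu_x)\right]/\sigma_x^2$; the second function evaluates that simplified form so the two can be compared numerically.

```python
import math


def moments(xs, ys):
    """Sample means, variances (divisor n), and covariance of paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((a - mx) ** 2 for a in xs) / n
    vy = sum((b - my) ** 2 for b in ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
    return mx, my, vx, vy, cov


def regression_influence(x, y, xs, ys):
    """Plug-in estimates of Eq. (8) (slope) and Eq. (9) (intercept) at (x, y)."""
    mx, my, vx, vy, cov = moments(xs, ys)
    sx, sy = math.sqrt(vx), math.sqrt(vy)
    rho = cov / (sx * sy)
    beta = rho * sy / sx
    u, v = (x - mx) / sx, (y - my) / sy              # standardized coordinates
    infl_rho = u * v - rho * (u * u + v * v) / 2.0   # Eq. (3)
    infl_vy = (y - my) ** 2 - vy                     # Eq. (2) applied to Y
    infl_vx = (x - mx) ** 2 - vx                     # Eq. (2) applied to X
    infl_beta = (sy / sx) * infl_rho + rho * (
        infl_vy / (2.0 * sx * sy) - sy * infl_vx / (2.0 * sx ** 3)
    )
    infl_alpha = (y - my) - beta * (x - mx) - mx * infl_beta
    return infl_beta, infl_alpha


def slope_influence_closed_form(x, y, xs, ys):
    """Simplified form of Eq. (8): (x - mx) * residual / var(x)."""
    mx, my, vx, vy, cov = moments(xs, ys)
    beta = cov / vx
    return (x - mx) * ((y - my) - beta * (x - mx)) / vx


# Illustrative data only, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope_infl, intercept_infl = regression_influence(6.0, 20.0, xs, ys)
```

The simplified form shows why a point far from $\mu_x$ with a large residual has the largest influence on the slope, which is the usual notion of a high-leverage outlier.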

Techniques for determining influential observations and leverage points
in regression problems are given in Belsley, Kuh, and Welsch (Ref. 8).
Because the classical approach to orbit determination involves a linearization
which leads to solving a large regression problem (i.e., estimating six or
more parameters), the techniques given by Belsley, Kuh, and Welsch could be
useful. Also, an influence function for the regression parameters could be
calculated similarly to the calculation illustrated here for the simple linear
regression.

In each of the four examples given in this section, the estimator $\hat{I}$ will
be consistent for $I$ if consistent estimates of the unknown parameters are
used. Also, if the maximum likelihood estimates are used for the unknown
parameters, the estimator $\hat{I}$ will be a maximum likelihood estimator of $I$.


3.  OUTLIER  DETECTION  AND  REPLACEMENT  OF  MISSING  OBSERVATIONS 


In Gnanadesikan (Ref. 7), it is shown how contours of constant influence
based on Eq. (3) can be used as a graphical tool for detecting outliers with
respect to bivariate correlation. Chernick, Downing, and Pike (Ref. 3) use
influence function estimates for the autocorrelation function of a time series
to determine outliers.

Here we propose the use of the influence function to determine observation
values for replacing outliers or for replacing missing observations.
Observations with unduly high influence should be replaced by values which
have little or no influence on the estimated parameter or parameters. The
philosophy is that if an observation needs to be replaced and no additional
information is available about what the correct value should be, then one
should choose a value that does not influence estimates of importance to the
users of the data. All this assumes that there is a need to replace the
outlier or to fill in a missing observation.

We shall now illustrate this approach for the case when the estimate of
interest is the population variance (i.e., the example of Eq. (2)).

Here we assume that we have a sample of size $n$ and we are concerned that
we might have one or two outliers. By rearranging the numbering of the
observations, if one outlier is detected, we assume for notational convenience
that it is the $n$th observation.

Suppose in the case of one outlier that we observed a value
$X_n > \bar{X}_{n-1} + 3 S_{n-1}$. The influence function estimate for both the mean and
the variance at $X_n$ will be large, indicating that the observation is an
outlier; we shall replace $X_n$ with $\hat{X}_n$.

Let

$$\bar{X}_n = \frac{1}{n} \left[ \sum_{i=1}^{n-1} X_i + \hat{X}_n \right], \qquad (10)$$

$$\bar{X}_{n-1} = \frac{1}{n-1} \sum_{i=1}^{n-1} X_i, \qquad S_{n-1}^2 = \frac{1}{n-1} \sum_{i=1}^{n-1} (X_i - \bar{X}_{n-1})^2,$$

$$S_n^2 = \frac{1}{n} \sum_{i=1}^{n-1} (X_i - \bar{X}_n)^2 + \frac{1}{n} (\hat{X}_n - \bar{X}_n)^2, \qquad (11)$$

where $\hat{X}_n$ is our choice for a replacement to $X_n$. From Eqs. (10) and (11)
we get

$$\bar{X}_n - \bar{X}_{n-1} = \frac{\hat{X}_n - \bar{X}_{n-1}}{n}, \qquad (12)$$

$$S_n^2 - S_{n-1}^2 = -\frac{S_{n-1}^2}{n} + \frac{n-1}{n^2} (\hat{X}_n - \bar{X}_{n-1})^2. \qquad (13)$$

Ideally, we would like to choose $\hat{X}_n$ so that $\bar{X}_n = \bar{X}_{n-1}$ and $S_n^2 = S_{n-1}^2$; however,
we cannot quite do this. If we choose $\hat{X}_n = \bar{X}_{n-1}$, then $\bar{X}_n = \bar{X}_{n-1}$. However,
choosing $\hat{X}_n = \bar{X}_{n-1}$ tends to make $S_n^2 < S_{n-1}^2$. In fact, from Eq. (13) we see
that $S_n^2 - S_{n-1}^2 = -S_{n-1}^2/n$. On the other hand, Eq. (2) tells us that
$I(F, T(F), x) = 0$ if $(x - \mu)^2 = \sigma^2$ or $x = \mu \pm \sigma$. Consequently, one would
suspect that the choice $\hat{X}_n = \bar{X}_{n-1} + S_{n-1}$ or $\hat{X}_n = \bar{X}_{n-1} - S_{n-1}$ would have a
smaller influence on the estimate of $\sigma^2$. This suspicion is borne out by the
fact that for either choice $S_n^2 - S_{n-1}^2 = -S_{n-1}^2/n^2$.

The price paid for reducing the influence on the variance is an influence on
the mean. When $\hat{X}_n = \bar{X}_{n-1} + S_{n-1}$, $\bar{X}_n - \bar{X}_{n-1} = S_{n-1}/n$, and when
$\hat{X}_n = \bar{X}_{n-1} - S_{n-1}$, $\bar{X}_n - \bar{X}_{n-1} = -S_{n-1}/n$. A slight modification of this
replacement for $X_n$ leads to a zero change in the estimate of $\sigma^2$. If we solve
the equation $S_n^2 = S_{n-1}^2$ for $\hat{X}_n$, we find

$$\hat{X}_n = \bar{X}_{n-1} \pm \sqrt{\frac{n}{n-1}}\, S_{n-1}.$$

In the case $\hat{X}_n = \bar{X}_{n-1} + \sqrt{n/(n-1)}\, S_{n-1}$ we have
$\bar{X}_n - \bar{X}_{n-1} = S_{n-1}/\sqrt{n(n-1)}$, and when $\hat{X}_n = \bar{X}_{n-1} - \sqrt{n/(n-1)}\, S_{n-1}$ we have
$\bar{X}_n - \bar{X}_{n-1} = -S_{n-1}/\sqrt{n(n-1)}$. We shall now consider
the case of two outliers.

Theorem 1

In the case of two outliers, say $X_{n-1}$ and $X_n$, we can choose

$$\hat{X}_{n-1} = \bar{X}_{n-2} + S_{n-2}$$

and

$$\hat{X}_n = \bar{X}_{n-2} - S_{n-2}$$

where

$$\bar{X}_{n-2} = \frac{1}{n-2} \sum_{i=1}^{n-2} X_i$$

and

$$S_{n-2}^2 = \frac{1}{n-2} \sum_{i=1}^{n-2} (X_i - \bar{X}_{n-2})^2.$$

In this case we have $\bar{X}_n - \bar{X}_{n-2} = 0$ and $S_n^2 - S_{n-2}^2 = 0$.
Proof

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n-2} X_i + \frac{\hat{X}_n + \hat{X}_{n-1}}{n} = \frac{n-2}{n}\, \bar{X}_{n-2} + \frac{\hat{X}_n + \hat{X}_{n-1}}{n}.$$

Our choice of $\hat{X}_n$ and $\hat{X}_{n-1}$ makes $(\hat{X}_n + \hat{X}_{n-1})/n = 2\bar{X}_{n-2}/n$. Consequently,

$$\bar{X}_n = \bar{X}_{n-2}. \qquad (14)$$

From Eq. (14), we see that

$$S_n^2 = \frac{1}{n} \sum_{i=1}^{n-2} (X_i - \bar{X}_{n-2})^2 + \frac{(\hat{X}_{n-1} - \bar{X}_{n-2})^2}{n} + \frac{(\hat{X}_n - \bar{X}_{n-2})^2}{n} = \frac{n-2}{n}\, S_{n-2}^2 + \frac{2}{n}\, S_{n-2}^2 = S_{n-2}^2. \qquad (15)$$

We note that as long as we choose $\hat{X}_n = 2\bar{X}_{n-2} - \hat{X}_{n-1}$, Eq. (14) will be
satisfied. But in order to also obtain Eq. (15), we must have
$\hat{X}_{n-1} = \bar{X}_{n-2} \pm S_{n-2}$. We are fortunate that in this situation, because of
symmetry, we can find replacement values which leave both the mean and
variance estimates unchanged.
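Theorem 1 is easy to check numerically. The sketch below is an illustration only: the function names and data are hypothetical, and the variance uses the divisor equal to the number of values, as in the definitions of $S_{n-2}^2$ and $S_n^2$ above.

```python
def mean_and_variance(values):
    """Mean and variance with divisor len(values), as in the text."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / n


def replace_two_outliers(clean):
    """Append the two replacement values of Theorem 1 to the n-2 clean observations."""
    m, var = mean_and_variance(clean)
    s = var ** 0.5
    return clean + [m + s, m - s]


# Illustrative data only: the n-2 observations that are not outliers.
clean = [4.0, 6.0, 8.0, 10.0]
completed = replace_two_outliers(clean)

mean_before, var_before = mean_and_variance(clean)
mean_after, var_after = mean_and_variance(completed)
```

Both estimates are unchanged: the two replacements are symmetric about the mean, and each contributes exactly $S_{n-2}^2$ to the sum of squares.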


In the case of one outlier, we cannot do this. If estimating the mean is
important, and we do not care about the estimate of variance, one should
choose $\hat{X}_n = \bar{X}_{n-1}$. On the other hand, if the estimate of variance is much
more important, one should use $\hat{X}_n = \bar{X}_{n-1} + S_{n-1}$ or $\hat{X}_n = \bar{X}_{n-1} - S_{n-1}$, with
the choice among these two estimates dictated by whether $X_n$ was larger than
$\bar{X}_{n-1}$ or not (in the case when the outlying observation $X_n$ is known). If both
parameters are important in the estimation, a compromise choice for $\hat{X}_n$ should
be chosen, perhaps by minimizing a weighted average of the absolute value (or
square) of the estimated influence functions.
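The trade-off between these single-outlier choices can be tabulated with a short sketch. The names and data are hypothetical; the changes it reports follow Eqs. (12) and (13), with the variance divisor equal to the number of values in each sample.

```python
def mean_and_variance(values):
    """Mean and variance with divisor len(values), as in Eqs. (10) and (11)."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / n


def replacement_effects(clean, replacement):
    """Return (change in mean, change in variance) after adding the replacement."""
    m_old, v_old = mean_and_variance(clean)
    m_new, v_new = mean_and_variance(clean + [replacement])
    return m_new - m_old, v_new - v_old


# Illustrative data only: the n-1 observations that remain after removing the outlier.
clean = [4.0, 6.0, 8.0, 10.0]
n = len(clean) + 1
m, v = mean_and_variance(clean)
s = v ** 0.5

candidates = {
    "mean": m,                                           # leaves the mean unchanged
    "mean + S": m + s,                                   # zero estimated variance influence
    "mean + sqrt(n/(n-1)) S": m + (n / (n - 1)) ** 0.5 * s,  # leaves the variance unchanged
}
effects = {label: replacement_effects(clean, value) for label, value in candidates.items()}
```

For this data ($n = 5$, $S_{n-1}^2 = 5$) the three choices change the variance by $-1$, $-0.2$, and $0$ respectively, while the change in the mean grows from $0$ to $0.5$, which is the price described in the text.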
The approach of minimizing a weighted average of the absolute value (or
square) of the influence function can be generalized to the case of several
parameters. Observations will be declared outliers if they have an unduly
large influence on any of the important parameters, and they will be replaced
by values that minimize their average absolute or average square influence.

The methodology described here for the variance could also be applied in
the case of the correlation coefficient or the regression parameters or for
combinations of these parameters.
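One possible reading of the compromise rule, for the mean and variance only, is sketched below as a simple grid search over candidate replacement values. Everything here is an assumption made for illustration: the function name, the weights, the `side` argument that keeps the replacement on the same side of the mean as the outlier, the search range of two standard deviations, and the use of squared (rather than absolute) influences. Because the two influence functions are in different units, the weights must also absorb that scaling.

```python
def compromise_replacement(clean, weight_mean, weight_var, side=1, steps=2000):
    """Grid-search a replacement value minimizing a weighted sum of squared
    estimated influences on the mean and on the variance.

    `clean` holds the observations that are not being replaced; `side` is +1
    to search above their mean and -1 to search below it.
    """
    n = len(clean)
    m = sum(clean) / n
    v = sum((x - m) ** 2 for x in clean) / n
    s = v ** 0.5

    def cost(x):
        infl_mean = x - m             # Eq. (1) with plug-in estimates
        infl_var = (x - m) ** 2 - v   # Eq. (2) with plug-in estimates
        return weight_mean * infl_mean ** 2 + weight_var * infl_var ** 2

    # Candidates run from the mean out to two standard deviations on one side.
    candidates = [m + side * 2.0 * s * k / steps for k in range(steps + 1)]
    return min(candidates, key=cost)


# Illustrative data only: mean 7, variance 5.
clean = [4.0, 6.0, 8.0, 10.0]
mean_only = compromise_replacement(clean, weight_mean=1.0, weight_var=0.0)
variance_only = compromise_replacement(clean, weight_mean=0.0, weight_var=1.0)
balanced = compromise_replacement(clean, weight_mean=1.0, weight_var=1.0)
```

With all weight on the mean the search returns the mean itself, with all weight on the variance it returns the mean plus one standard deviation, and intermediate weights give a value between the two.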


4. AN EXAMPLE OF POWER PLANT DATA


The Department of Energy collects monthly data on fuel consumption and
electricity generation for all utilities and some industrial plants. For two
particular plants, three years of monthly data were analyzed and outliers were
found using the influence function for bivariate correlation (Chernick [Ref. 2]).
Subsequently, these data were used to illustrate a new time series technique for
detecting outliers based on influence functions (Chernick, et al. [Ref. 3]).

Table 1 presents the 36 values of the consumption data for one of the plants.
This table also includes values for the influence function estimates for the
mean and variance of the sample at each observation point. Notice that
observation no. 23 has the largest influence on both the mean and variance. In
computing the influence function for the correlation between consumption and
generation, observation no. 23 also stood out. Note that for the variance,
the observations closest to the sample mean have the largest negative
influence, although these observations have the smallest influence on the mean.

Assuming that one wanted to impute a value to observation no. 23, we
would choose the value 9.3 if we wanted to leave the mean unchanged. If we
wanted to leave the variance unchanged, we use either 9.3 + 0.38 = 9.68 or
9.3 - 0.38 = 8.92. Because the consumption and generation data are highly
correlated and the generation data were not suspect in this case, an estimate
for the consumption data on the basis of regression or on the influence
function for bivariate correlation would be more appropriate in this application.


Table 1. Influence Function Estimates for Power Plant Data

(The Consumption column is not legible in the source.)

Observation      Influence Functions
Number           Mean        Variance

     1           59.4         3167.3
     2           -7.6         -303.3
     3           -8.6         -287.1
     4           -6.6         -317.5
     5           -7.6         -303.3
     6           -8.6         -287.1
     7           36.4          963.9
     8           -0.6         -360.7
     9           -0.6         -360.7
    10           -6.6         -317.5
    11           -7.6         -303.3
    12            4.4         -341.7
    13           -8.6         -287.1
    14           -7.6         -303.3
    15           -5.6         -329.7
    16            3.4         -349.5
    17           -6.6         -317.5
    18           -8.6         -287.1
    19           -7.6         -303.3
    20            1.4         -359.1
    21           -5.6         -329.7
    22           -6.6         -317.5
    23           81.4         6264.9
    24           -7.6         -303.3
    25           -3.6         -348.1
    26           -7.6         -303.3
    27           -6.6         -317.5
    28           -6.6         -317.5
    29           -7.6         -303.3
    30           -5.6         -329.7
    31            7.4         -306.3
    32           -7.6         -303.3
    33           -8.6         -287.1
    34           -8.6         -287.1
    35           -6.6         -317.5
    36           -7.6         -303.3